







A Appendix

Neural Information Processing Systems

A.1 Case study of a learned continual learner

We notice that when learning a new task, the lower layers of the network (including the first, i.e. lowest, layer) tend to use the "mask" action: because each task has its own distinct characteristics, masking lets these layers control which of their output values are passed on. The higher layers, in contrast, are able to combine low-dimensional features, and therefore use the "fuse" action more often in order to combine the abilities learned for previous tasks. "With/without mask" means the "mask" action is (or is not) used. The results are summarized in Table 5. BNS is better overall than the baseline models in terms of forward transfer, which we attribute to its use of reinforcement learning.

[Table caption] Each number is the sum of all task model parameters in each setting in the final network after all tasks have been learned.
Table 9: Training time (minutes) used by our BNS model and all baselines in each experiment.
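As a rough illustration of the per-layer behavior described above, the sketch below implements a "mask" action (zeroing selected output values with a binary gate) and a "fuse" action (combining the outputs of modules learned for previous tasks by element-wise averaging). The function names, shapes, and the averaging form of "fuse" are illustrative assumptions, not the BNS implementation.

```python
# Hypothetical sketch of the "mask" and "fuse" layer actions; the exact
# form of these actions in BNS may differ.

def mask(outputs, gate):
    """Zero out output values where the binary gate is 0 (the "mask" action)."""
    return [o * g for o, g in zip(outputs, gate)]

def fuse(task_outputs):
    """Combine outputs of modules from previous tasks by element-wise
    averaging (one possible form of the "fuse" action)."""
    n = len(task_outputs)
    return [sum(vals) / n for vals in zip(*task_outputs)]

# A lower layer masks its own output; a higher layer fuses features
# produced by modules for two previous tasks.
low = mask([0.5, 1.0, 2.0], [1, 0, 1])   # -> [0.5, 0.0, 2.0]
high = fuse([[1.0, 2.0], [3.0, 4.0]])    # -> [2.0, 3.0]
```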


Disentangling Transfer in Continual Reinforcement Learning
Maciej Wołczyk, Faculty of Mathematics and Computer Science

Neural Information Processing Systems

We adopt SAC as the underlying RL algorithm and Continual World as a suite of continuous control tasks. We systematically study how different components of SAC (the actor and the critic, exploration, and data) affect transfer efficacy, and we provide recommendations regarding various modeling options.
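As a minimal sketch of what studying "different components of SAC" can mean in practice, the snippet below copies only selected parameter groups (e.g. the actor but not the critic) from a previous task's agent into a freshly initialized one. The dict layout and the function name are assumptions for illustration, not the paper's code.

```python
import copy

def transfer_components(src_params, dst_params, components=("actor",)):
    """Return dst_params with the chosen components overwritten by src_params.

    src_params / dst_params: dicts mapping a component name ("actor",
    "critic", ...) to that component's parameters.
    """
    out = copy.deepcopy(dst_params)
    for name in components:
        out[name] = copy.deepcopy(src_params[name])
    return out

# Transfer only the actor from the previous task; the critic keeps its
# fresh (re-initialized) parameters.
prev = {"actor": [0.9, -0.2], "critic": [1.5, 0.3]}
fresh = {"actor": [0.0, 0.0], "critic": [0.0, 0.0]}
new = transfer_components(prev, fresh, components=("actor",))
# new["actor"] == [0.9, -0.2]; new["critic"] == [0.0, 0.0]
```

Varying which components are transferred (actor, critic, both, or neither, plus replayed data or exploration state) is one way to disentangle their individual contributions to transfer.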


